Hello there! 👋 We're excited to take you on an insightful journey through our interactive notebook designed to explore drought events across the globe over time. Our main focus will be on understanding how droughts have varied both geographically and temporally.
To reach this goal we need to recognize and analyze drought occurrences worldwide from the year 1940 up to the present. We will do this by examining the Standardized Precipitation Evapotranspiration Index (SPEI) values, which help us understand moisture deficit better.
📌 Since each of you has different needs and interests, we've made the notebook as interactive as possible. We'll guide you through the cells, but you'll have the freedom to choose what to focus on: the geographical area, the type of index aggregation, and the time period.
The SPEI is a powerful index used by scientists to determine drought conditions. It considers both precipitation and evapotranspiration (the sum of evaporation and plant transpiration from the Earth's surface to the atmosphere) to give a standardized measure of moisture adequacy in different regions and times. You can find more information on the dedicated page of our handbook.
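To make the idea more concrete, here is a minimal sketch of the quantity behind the index: the climatic water balance (precipitation minus potential evapotranspiration), standardized against its own history. The monthly values are made up for illustration, and the plain z-score is a simplifying assumption. The actual SPEI fits a probability distribution to the accumulated water balance before standardizing, so treat the numbers as illustrative only.

```python
from statistics import mean, stdev

# Hypothetical values (mm) for the same calendar month in six different years.
precipitation = [80.0, 65.0, 20.0, 90.0, 55.0, 10.0]
potential_evapotranspiration = [60.0, 62.0, 70.0, 58.0, 61.0, 75.0]

# Climatic water balance: positive means surplus, negative means deficit.
water_balance = [p - pet for p, pet in zip(precipitation, potential_evapotranspiration)]

# Simplified standardization (z-score). This is NOT the full SPEI procedure.
mu = mean(water_balance)
sigma = stdev(water_balance)
standardized = [(d - mu) / sigma for d in water_balance]

for year_offset, (d, z) in enumerate(zip(water_balance, standardized)):
    print(f"year {year_offset}: balance = {d:6.1f} mm, standardized = {z:5.2f}")
```

Strongly negative standardized values mark the years with the largest moisture deficit relative to the local history, which is the intuition to keep in mind when reading SPEI maps and charts.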
The data we will use comes from ERA5, one of the most comprehensive atmospheric reanalysis datasets available. Specifically, we are working with 'nc' (NetCDF) files, a file format for storing complex scientific data so that it can be accessed and processed efficiently (for more detail, see the dedicated page).
Let's dive in and start our exploration to better understand the patterns and impacts of droughts around the world! 🌍
In this notebook, we will explore drought events from various points of view:
Before we dive into the data analysis, we need to ensure our notebook has all the necessary tools and libraries. The following cells involve installing various Python packages that will help us manipulate data, create visualizations, and interact with our notebook more effectively.
For more detailed information on these steps, you can refer to the Setting Up chapter in the handbook. Alternatively, you can simply run these cells and proceed to the first analysis section.
So let's start by installing the packages:
!pip install numpy
!pip install xarray
!pip install netCDF4
!pip install "dask[complete]"
!pip install folium
!pip install matplotlib
!pip install plotly
!pip install -U kaleido
!pip install ipywidgets
Now we enable the IPython widget extensions:
!jupyter nbextension enable --py widgetsnbextension
!jupyter labextension install @jupyter-widgets/jupyterlab-manager # only for JupyterLab environment
and import the necessary Python libraries and modules that we'll use throughout our analysis.
We also need to import four custom modules from the utils folder: widgets_handler, coordinates_retriver, data_preprocess and charts. These modules contain custom functions tailored to handle widgets, retrieve coordinates, preprocess data, and create charts.
from ipywidgets import Layout, Dropdown, widgets
from IPython.display import display, clear_output, IFrame
from functools import partial
import datetime
import numpy as np
import utils.widgets_handler as widgets_handler
import utils.coordinates_retriver as coordinates_retriver
import utils.data_preprocess as data_preprocess
import utils.charts as charts
import warnings
warnings.filterwarnings("ignore", category=RuntimeWarning)
This cell sets up the initial state and interface for the interactions:
country_list = widgets_handler.read_json_to_sorted_dict('countries.json')
months = widgets_handler.read_json_to_dict('months.json')
timescales = widgets_handler.read_json_to_dict('timescales.json')
subset_area = None
bounding_box = (None, None, None, None)
active_btn = None
selected = {
"country": None,
"adm1_subarea": None,
"adm2_subarea": None,
"timescale": None,
"month": None,
"year": None,
"year_range": None
}
placeholders = {
"country": "no country selected...",
"adm1_subarea": "no adm1 subarea selected...",
"adm2_subarea": "no adm2 subarea selected...",
"timescale": "no timescale selected...",
"month": "no month selected...",
"year": "no year selected..."
}
widgets_handler.save_selection(placeholders)
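The JSON helpers used in this cell live in utils/widgets_handler, and their source is not shown in this notebook. As a rough idea of what reading settings from JSON and saving a selection could look like, here is a sketch under that assumption. The names load_json_dict and save_json_dict are hypothetical stand-ins, not the module's actual functions, and the demo writes to a temporary directory so no real notebook file is touched.

```python
import json
import tempfile
from pathlib import Path


def load_json_dict(path):
    """Hypothetical stand-in for a JSON-reading helper: returns the file content as a dict."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def save_json_dict(data, path):
    """Hypothetical stand-in for a selection-saving helper: writes a dict as JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)


# Work in a temporary directory so no real notebook file is overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "selection_demo.json"
    demo_placeholders = {
        "country": "no country selected...",
        "month": "no month selected...",
    }
    save_json_dict(demo_placeholders, demo_path)
    restored = load_json_dict(demo_path)

# The round trip should give back exactly what was saved.
print(restored == demo_placeholders)
```

Persisting the selection this way is what would let later cells restore the dropdowns to the previously chosen values.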
Now we set up and configure the widgets:
# Custom style and layout for descriptions and dropdowns
style = {'description_width': '150px'}
dropdown_layout = Layout(width='400px', display='flex', justify_content='flex-end')
range_layout = Layout(width='400px')
btn_layout = Layout(width='400px')
# Dropdown for countries
country_names = [country['name'] for country in country_list]
country_selector = widgets.Dropdown(
options=[placeholders['country']] + country_names,
description='Select a country:',
style=style,
layout=dropdown_layout
)
# Dropdown for subareas, initially empty
adm1_subarea_selector = widgets.Dropdown(
options=[placeholders['adm1_subarea']],
description='a subarea of first level:',
style=style,
layout=dropdown_layout
)
adm2_subarea_selector = widgets.Dropdown(
options=[placeholders['adm2_subarea']],
description='or of second level:',
style=style,
layout=dropdown_layout
)
# Dropdown for timescales
timescale_selector = widgets.Dropdown(
options=[placeholders['timescale']] + list(timescales.keys()),
description='Select a timescale:',
style=style,
layout=dropdown_layout
)
# Dropdown for months
month_selector = widgets.Dropdown(
options=[placeholders['month']] + list(months.keys()),
description='Select a month:',
style=style,
layout=dropdown_layout
)
# Dropdown for years
current_year = datetime.datetime.now().year
years_options = [str(year) for year in range(1940, current_year + 1)]
year_selector = widgets.Dropdown(
options=[placeholders['year']] + years_options,
description='Select a year:',
disabled=False,
style=style,
layout=dropdown_layout
)
# SelectionRangeSlider for years
year_range_selector = widgets.SelectionRangeSlider(
options=years_options,
index=(len(years_options) - 1, len(years_options) - 1), # Start and end at the last
description='Select the year range:',
disabled=False,
style=style,
layout=range_layout
)
selectors = {
"country" : country_selector,
"adm1_subarea": adm1_subarea_selector,
"adm2_subarea": adm2_subarea_selector,
"timescale": timescale_selector,
"month": month_selector,
"year": year_selector,
"year_range": year_range_selector
}
month_widgets_btn = widgets.Button(
description='Get data',
disabled=False,
button_style='info', # 'success', 'info', 'warning', 'danger' or ''
tooltip='Click me',
icon='filter', # (FontAwesome names without the `fa-` prefix)
layout=btn_layout
)
month_widgets_btn.custom_name='month_widgets_btn'
year_widgets_btn = widgets.Button(
description='Get data',
disabled=False,
button_style='info', # 'success', 'info', 'warning', 'danger' or ''
tooltip='Click me',
icon='filter', # (FontAwesome names without the `fa-` prefix)
layout=btn_layout
)
year_widgets_btn.custom_name='year_widgets_btn'
year_range_widgets_btn = widgets.Button(
description='Get data',
disabled=False,
button_style='info', # 'success', 'info', 'warning', 'danger' or ''
tooltip='Click me',
icon='filter', # (FontAwesome names without the `fa-` prefix)
layout=btn_layout
)
year_range_widgets_btn.custom_name='year_range_widgets_btn'
# Output area for display updates
output_area = widgets.Output()
The functions in the next cell handle user input, process data based on those inputs, and update the notebook interface accordingly:
def setup_observers():
    """
    Sets up observers for UI widgets to handle interactions and updates dynamically in a graphical user interface.
    This function ensures that observers are only set once using a function attribute to track whether observers have
    already been established, enhancing efficiency and preventing multiple bindings to the same event.
    Observer is attached to widgets for country selection. This observer triggers specific functions when the 'value' property
    of the widgets changes, facilitating responsive updates to the user interface
    based on user interactions.
    Notes:
    - This function uses a custom attribute `observers_set` on itself to ensure observers are set only once.
    """
    if not hasattr(setup_observers, 'observers_set'):
        # When 'value' changes, update_subareas function will be called to update the dropdown menus
        # Create a partial function that includes the additional parameters
        country_selector.observe(partial(widgets_handler.update_subareas,
                                         country_list=country_list,
                                         placeholders=placeholders,
                                         adm1_subarea_selector=adm1_subarea_selector,
                                         adm2_subarea_selector=adm2_subarea_selector), 'value')
        # Set a flag to indicate observers are set
        setup_observers.observers_set = True
def update_and_get_data(btn_name):
    """
    Update and retrieve data based on user interactions and selections.
    This function handles user interactions, validates selections, calculates geographic bounding boxes,
    fetches the corresponding data subset, and updates the output area with relevant information and a map display.
    Parameters:
    btn_name (str): The name of the button that triggered the interaction.
    Global Variables:
    selected (dict): Dictionary containing current selections for various parameters.
    placeholders (dict): Dictionary of placeholder values.
    output_area (widgets.Output): The output area widget to display messages and results.
    subset_data (xarray.DataArray): Subset of data fetched based on the bounding box.
    index (str): Index for the subset data, constructed from timescale value.
    bounding_box (tuple): Bounding box coordinates (min_lon, min_lat, max_lon, max_lat) for the selected area.
    active_btn (str): The name of the currently active button.
    Steps:
    1. Set the active button name.
    2. Update the month and year selections based on the button interaction.
    3. Validate the current selections.
    4. If selections are valid:
        a. Clear the output area.
        b. Retrieve the geographic boundaries for the selected area.
        c. Calculate the bounding box for the selected area.
        d. Fetch the data subset based on the bounding box.
        e. Determine the administrative level, selected area name, timescale, and time period.
        f. Print information about the uploaded subset data.
        g. Display the map with the bounding box and appropriate zoom level.
    Notes:
    - The function assumes the existence of utility functions within the 'utils' modules for handling interactions, validations,
      data fetching, and map display.
    - The global variables should be properly initialized before calling this function.
    """
    global selected, placeholders, output_area, subset_data, index, bounding_box, active_btn
    map_display = None
    active_btn = btn_name
    widgets_handler.month_year_interaction(btn_name, month_selector, year_selector, selected, placeholders)
    if widgets_handler.validate_selections(btn_name, selected, selectors, placeholders, output_area):
        with output_area:
            output_area.clear_output(wait=True)
            coordinates = coordinates_retriver.get_boundaries(selected, country_list, placeholders)
            # print(coordinates)
            bounding_box = coordinates_retriver.calculate_bounding_box(coordinates)
            # print(bounding_box)
            # sample_coordinates = coordinates[:3] # Showing first 3 coordinates for brevity
            # print('Original Coordinates Sample: ', sample_coordinates)
            # print('Bounding Box: ', bounding_box)
            # Fetching data using the bounding box
            subset_data = data_preprocess.get_xarray_data(btn_name, bounding_box, selectors, placeholders, months, timescales)
            index = f"SPEI{timescales[selectors['timescale'].value]}"
            adm_level, selected_area = widgets_handler.get_adm_level_and_area_name(selected, placeholders)
            timescale = selected['timescale']
            time_period = widgets_handler.get_period_of_time(btn_name, selected, placeholders)
            print(f"SPEI subset data uploaded for {selected_area}, administrative level {adm_level}, timescale {timescale}, period {time_period}")
            # Zoom in closer when a subarea (ADM1 or ADM2) is selected
            zoom_start = 4
            if adm_level == 'ADM1' or adm_level == 'ADM2':
                zoom_start = 8
            map_display = coordinates_retriver.display_map(bounding_box, zoom_start)
            map_iframe = coordinates_retriver.display_map_in_iframe(map_display)
            display(map_iframe)
# Set up widget interaction
def on_button_clicked(btn):
    update_and_get_data(btn.custom_name)
# Setup observers
setup_observers()
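The observer set up in setup_observers relies on functools.partial: a widget calls its observer with a single change argument, so any extra parameters have to be bound in advance. The sketch below illustrates that pattern without ipywidgets. FakeDropdown and update_subareas_demo are hypothetical stand-ins written for this example and do not reproduce the actual widgets_handler.update_subareas logic.

```python
from functools import partial


class FakeDropdown:
    """Hypothetical, minimal stand-in for a widget with an observable 'value'."""

    def __init__(self, value):
        self._value = value
        self._callbacks = []

    def observe(self, callback, names="value"):
        # Only 'value' changes are supported in this simplified stand-in.
        self._callbacks.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        old_value, self._value = self._value, new_value
        for callback in self._callbacks:
            # Callbacks receive a single dict-like 'change' argument.
            callback({"name": "value", "old": old_value, "new": new_value})


def update_subareas_demo(change, subareas_by_country, target):
    """Hypothetical callback: refresh the dependent list when the country changes."""
    target.clear()
    target.extend(subareas_by_country.get(change["new"], []))


subarea_options = []
country = FakeDropdown("no country selected...")

# partial() pre-binds the extra keyword arguments; the widget supplies 'change'.
country.observe(
    partial(update_subareas_demo,
            subareas_by_country={"Japan": ["Hokkaido", "Okinawa"]},
            target=subarea_options),
    "value",
)

country.value = "Japan"
print(subarea_options)
```

Binding the parameters up front keeps the callback free of global lookups, at the cost of fixing those objects at registration time.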
It's time for our first analysis. We will select a geographic area, a... Regarding the choice of the area, please keep in mind that the larger the area, the more computational power and time it takes to retrieve the data. So, if your device is not powerful, choose smaller areas, such as second-level subareas.
If you click the 'Get data' button before choosing the necessary options from the dropdown menus, a message will be displayed below the widget block explaining what is missing.
If all the selections are made, you will receive three messages:
# Update existing selectors
previous_selection = widgets_handler.read_json_to_dict('selection.json')
# Set up widgets with previous settings
country_selector.value = previous_selection.get('country', placeholders['country'])
adm1_subarea_selector.value = previous_selection.get('adm1_subarea', placeholders['adm1_subarea'])
adm2_subarea_selector.value = previous_selection.get('adm2_subarea', placeholders['adm2_subarea'])
timescale_selector.value = previous_selection.get('timescale', placeholders['timescale'])
month_selector.value = previous_selection.get('month', placeholders['month'])
month_widgets_btn.on_click(on_button_clicked)
# Display widgets
display(country_selector, adm1_subarea_selector, adm2_subarea_selector, timescale_selector, month_selector, month_widgets_btn, output_area)
Now you have retrieved the data of your interest in a variable named subset_data[index], where index is the SPEI index you have chosen.
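The subset is cut out using a bounding box around the boundary of the selected area. As a rough illustration of that step, here is a sketch that assumes the boundary is available as a list of (lon, lat) pairs. The helper names are hypothetical, the vertices are made up, and the tuple order (min_lon, min_lat, max_lon, max_lat) follows the docstring of update_and_get_data. It is not the actual coordinates_retriver.calculate_bounding_box implementation.

```python
def bounding_box_from_points(points):
    """Return (min_lon, min_lat, max_lon, max_lat) for an iterable of (lon, lat) pairs."""
    points = list(points)
    if not points:
        raise ValueError("at least one coordinate pair is required")
    lons = [lon for lon, _ in points]
    lats = [lat for _, lat in points]
    return (min(lons), min(lats), max(lons), max(lats))


def inside(box, lon, lat):
    """Check whether a grid cell centre falls inside the bounding box."""
    min_lon, min_lat, max_lon, max_lat = box
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


# Made-up boundary vertices for illustration.
boundary = [(139.3, 41.4), (145.8, 43.3), (141.9, 45.5), (140.0, 42.0)]
box = bounding_box_from_points(boundary)

print(box)
print(inside(box, 142.0, 43.0), inside(box, 130.0, 43.0))
```

A box is a coarse approximation: it includes grid cells outside the actual boundary, and this simple min/max version does not handle areas that cross the antimeridian.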
Using the data_preprocess.display_data_details function, you can examine your data to check the following:
data_preprocess.display_data_details(active_btn, selected, subset_data[index])
Country: Japan
ADM1 subarea: Hokkaido
ADM2 subarea: no adm2 subarea selected...
Month: August
Timescale: 12 months
Time values in the subset: 84
Latitude values in the subset: 18
Longitude values in the subset: 27
Data sample:
[[-9999. -9999. -9999. -9999. -9999.]
 [-9999. -9999. -9999. -9999. -9999.]
 [-9999. -9999. -9999. -9999. -9999.]
 [-9999. -9999. -9999. -9999. -9999.]
 [-9999. -9999. -9999. -9999. -9999.]]
processed_subset, change_summary = data_preprocess.process_datarray(subset_data[index])
print(processed_subset, '\n')
print('Change summary:')
for key, val in change_summary.items():
    print(key, val)
<xarray.DataArray 'SPEI12' (time: 84, lat: 18, lon: 27)>
dask.array<where, shape=(84, 18, 27), dtype=float64, chunksize=(1, 18, 27), chunktype=numpy.ndarray>
Coordinates:
* time (time) datetime64[ns] 1940-08-01T06:00:00 ... 2023-08-01T06:00:00
* lon (lon) float64 139.2 139.5 139.8 140.0 ... 145.0 145.2 145.5 145.8
* lat (lat) float64 41.25 41.5 41.75 42.0 42.25 ... 44.75 45.0 45.25 45.5
Attributes:
long_name: Standardized Drought Index (SPEI12)
units: -
Change summary:
invalid_values_replaced 23477
invalid_ratio 57.51
duplicates_removed 0
cftime_conversions 0
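The change summary reports how many invalid values were replaced. The internals of process_datarray are not shown here, so the following is only a sketch of the general idea. It assumes that -9999 (the value visible in the data sample) is the fill value for missing cells and that those cells are replaced with NaN. The tiny grid is made up for illustration.

```python
import numpy as np

FILL_VALUE = -9999.0  # assumption: sentinel marking missing cells

# Tiny made-up grid with dimensions (time, lat, lon) = (2, 2, 3).
raw = np.array([
    [[-9999.0, 0.5, -1.2], [-9999.0, -9999.0, 0.3]],
    [[-9999.0, 1.1, -0.4], [-9999.0, -9999.0, -2.1]],
])

# Flag the sentinel cells and replace them with NaN so statistics can skip them.
invalid_mask = raw == FILL_VALUE
cleaned = np.where(invalid_mask, np.nan, raw)

invalid_count = int(invalid_mask.sum())
invalid_ratio = round(100 * invalid_count / raw.size, 2)

print("invalid_values_replaced", invalid_count)
print("invalid_ratio", invalid_ratio)
print("mean of valid cells", np.nanmean(cleaned))
```

The same arithmetic matches the Hokkaido summary: 23477 replaced values out of 84 × 18 × 27 = 40824 cells gives a ratio of 57.51.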
# Convert datetime objects to strings and extract the year for the slider labels
maps_year_labels = {i: str(processed_subset.time.values[i].astype('datetime64[Y]')) for i in range(len(processed_subset.time))}
maps_year_slider = widgets.SelectionSlider(
options=[(maps_year_labels[i], i) for i in range(len(maps_year_labels))],
value=0,
description='Year:',
disabled=False,
continuous_update=False,
orientation='horizontal',
readout=True
)
# Use a lambda to pass both ds (processed_subset) and time_index to the function
maps_year_slider_plot = widgets.interactive(lambda time_index: charts.plot_spei_geographical_distribution(processed_subset, time_index), time_index=maps_year_slider)
display(maps_year_slider_plot)
stat_values = data_preprocess.compute_stats(processed_subset)
charts.create_scatterplot(stat_values, timescales, selected, placeholders)
charts.create_boxplot(stat_values, timescales, selected, placeholders)
charts.create_std_dev_bar_chart(stat_values, timescales, selected, placeholders)
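The charts above are built from summary statistics of the processed subset. The exact contents of compute_stats are not shown in this notebook, so the sketch below only illustrates one plausible reading: for each time step, summarize all valid grid cells while skipping NaN. The function name summarize_time_step and the grids are hypothetical.

```python
import math
from statistics import mean, median, pstdev


def summarize_time_step(grid):
    """Summarize one 2-D grid (list of rows) of SPEI values, skipping NaN cells."""
    valid = [v for row in grid for v in row if not math.isnan(v)]
    if not valid:
        return None  # nothing to summarize for this time step
    return {
        "mean": mean(valid),
        "median": median(valid),
        "min": min(valid),
        "max": max(valid),
        "std": pstdev(valid),
    }


nan = math.nan
# Made-up grids for two time steps (2 x 3 cells each).
grids = {
    "1940": [[nan, 0.5, -1.5], [nan, nan, 1.0]],
    "1941": [[nan, -2.0, -1.0], [nan, nan, 0.0]],
}

stats_by_year = {year: summarize_time_step(grid) for year, grid in grids.items()}
for year, stats in stats_by_year.items():
    print(year, stats)
```

Skipping NaN cells matters here because a large share of a bounding box can be invalid, and a single NaN would otherwise propagate into every statistic.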
# Update existing selectors
previous_selection = widgets_handler.read_json_to_dict('selection.json')
# Set up widgets with previous settings
country_selector.value = previous_selection.get('country', placeholders['country'])
adm1_subarea_selector.value = previous_selection.get('adm1_subarea', placeholders['adm1_subarea'])
adm2_subarea_selector.value = previous_selection.get('adm2_subarea', placeholders['adm2_subarea'])
timescale_selector.value = previous_selection.get('timescale', placeholders['timescale'])
year_selector.value = previous_selection.get('year', placeholders['year'])
year_widgets_btn.on_click(on_button_clicked)
# Display widgets
display(country_selector, adm1_subarea_selector, adm2_subarea_selector, timescale_selector, year_selector, year_widgets_btn, output_area)
data_preprocess.display_data_details(active_btn, selected, subset_data[index])
processed_subset, change_summary = data_preprocess.process_datarray(subset_data[index])
print(processed_subset, '\n')
print('Change summary:')
for key, val in change_summary.items():
    print(key, val)
stat_values = data_preprocess.compute_stats(processed_subset, full_stats=False)
charts.create_linechart(stat_values, timescales, selected, placeholders)
# Update existing selectors
previous_selection = widgets_handler.read_json_to_dict('selection.json')
# Set up widgets with previous settings
country_selector.value = previous_selection.get('country', placeholders['country'])
adm1_subarea_selector.value = previous_selection.get('adm1_subarea', placeholders['adm1_subarea'])
adm2_subarea_selector.value = previous_selection.get('adm2_subarea', placeholders['adm2_subarea'])
timescale_selector.value = previous_selection.get('timescale', placeholders['timescale'])
year_range_selector.value = previous_selection.get('year_range')
year_range_widgets_btn.on_click(on_button_clicked)
# Display widgets
display(country_selector, adm1_subarea_selector, adm2_subarea_selector, timescale_selector, year_range_selector, year_range_widgets_btn, output_area)
data_preprocess.display_data_details(active_btn, selected, subset_data[index])
processed_subset, change_summary = data_preprocess.process_datarray(subset_data[index])
print(processed_subset, '\n')
print('Change summary:')
for key, val in change_summary.items():
    print(key, val)
stat_values = data_preprocess.compute_stats(processed_subset, full_stats=False)
charts.create_stripechart(stat_values, timescales, selected, placeholders)
charts.create_stripechart(stat_values, timescales, selected, placeholders, 'year')
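When reading the stripe charts, it can help to translate SPEI values into descriptive classes. The sketch below does that with a simple threshold function. The thresholds follow a commonly used convention for standardized drought indices and are an assumption here, as are the function name classify_spei and the yearly values. The handbook may define its classes differently.

```python
def classify_spei(value):
    """Map an SPEI value to a descriptive class (thresholds are an assumption)."""
    if value <= -2.0:
        return "extremely dry"
    if value <= -1.5:
        return "severely dry"
    if value <= -1.0:
        return "moderately dry"
    if value < 1.0:
        return "near normal"
    if value < 1.5:
        return "moderately wet"
    if value < 2.0:
        return "very wet"
    return "extremely wet"


# Made-up yearly mean SPEI values for illustration.
yearly_means = {"2019": 0.4, "2020": -1.2, "2021": -2.3, "2022": 1.7}
labels = {year: classify_spei(value) for year, value in yearly_means.items()}

for year, label in labels.items():
    print(year, yearly_means[year], label)
```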
The list of countries, subareas, and their boundaries is obtained from geoBoundaries, the Global Database of Political Administrative Boundaries.